Lecture 3: Portfolio Optimization
Risk Aversion
- We now know from our historical data that risky asset returns tend
to beat safe asset returns
- Since there’s a positive “premium” for risk, how much risk are you
willing to bear?
- This depends on risk tolerance
Preview of our goal
- Solve two problems:
- How should we distribute our wealth between a risky portfolio of
assets and the risk‐free asset based on our risk aversion?
- What portfolio of risky assets should we hold?
- With many risky assets, there is an optimal portfolio of these
assets that all investors prefer
- Lectures on active management will address departures from this
optimal risky portfolio
Risk, Risk Aversion and Gambles
- Consider the following gamble (where you pay me a dollar):
- With probability 0.5, I give you 2 dollars
- With probability 0.5, I give you 50 cents
- What are the expected returns? Variance?
\[
\begin{aligned}
E(r_{\text{gamble}}) &= 0.5 \times (2-1) + 0.5 \times (0.5 - 1) =
0.25\\
\sigma^{2}(r_{\text{gamble}}) &= 0.5 \times (1 - 0.25)^2 + 0.5
\times (-0.5-0.25)^2 = 0.5625 \\
\sigma(r_{\text{gamble}}) &= \sqrt{ 0.5625 } = 0.75
\end{aligned}
\]
Would you take it? What if you paid me 1.25?
At what cost would you become unwilling to take this deal?
- This is known as the “certainty equivalent”
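The arithmetic for the first gamble can be checked with a short sketch. The helper name `gamble_moments` is hypothetical, and it assumes the return on each outcome is measured as (payoff - cost) / cost, which matches the slide's numbers when the cost is one dollar.

```python
from math import sqrt

def gamble_moments(outcomes, cost=1.0):
    """Return (expected return, variance, standard deviation) of a gamble.

    outcomes: list of (probability, payoff) pairs.
    cost: price paid to play. Returns are (payoff - cost) / cost (an assumption).
    """
    returns = [(p, (payoff - cost) / cost) for p, payoff in outcomes]
    mean = sum(p * r for p, r in returns)
    var = sum(p * (r - mean) ** 2 for p, r in returns)
    return mean, var, sqrt(var)

# First gamble: pay 1, receive 2 or 0.50 with equal probability.
mean, var, sd = gamble_moments([(0.5, 2.0), (0.5, 0.5)], cost=1.0)
print(mean, var, sd)  # 0.25 0.5625 0.75
```

Changing `cost` shows how the expected return falls as the price of the gamble rises, which is one way to explore the question of when the deal stops being attractive.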
Risk, Risk Aversion and Gambles
- Ok, now consider another gamble (where you pay me a dollar):
- With probability 0.4, I give you 3 dollars
- With probability 0.2, I give you 1 dollar
- With probability 0.4, I give you 25 cents
- What are the expected returns? Variance?
\[
\begin{aligned}
E(r_{\text{gamble}}) &= ?\\
\sigma^{2}(r_{\text{gamble}}) &= ?\\
\sigma(r_{\text{gamble}}) &= ?
\end{aligned}
\]
Preferences
- Let’s assume a simple model of preferences over returns and
risk.
\[
U(r) = E(r) - \frac{1}{2} \times A \times \sigma^{2}(r)
\]
- \(A>0\) implies risk aversion
- \(A = 0\) implies risk neutrality
- \(A < 0\) implies risk seeking
U(r) provides an ordering over investments,
given their returns and risks
This particular function can be derived under a number of
different assumptions and is widely used by practitioners and
academics
Preferences
\[
U(r) = E(r) - \frac{1}{2} \times A \times \sigma^{2}(r)
\]
[Figures omitted: plots for \(A = 1\), \(A = 0.5\) and \(A = 0.0\)]
Allocating between 1 risky and 1 riskless asset
- Back to our first example:
- The risky asset has expected return \(E(r_{p})=0.25\) and \(\sigma = 0.75\)
- and there is a riskless asset with return \(r_{f} = 0.03\), \(\sigma = 0\)
- Utility (\(U(r)\)) across assets:
| Risk Free | 0.03 | 0.03 | 0.03 | 0.03 |
| Risky | 0.24 | 0.11 | 0.03 | -0.03 |
For risk aversion of 0.5, do you prefer \(r_{f}\) or investment \(r_{p}\)?
What if the coefficient of risk aversion is 0.78?
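As a rough check on these comparisons, the sketch below evaluates the utility function for both assets at a few values of \(A\) and solves for the risk aversion at which the two utilities are equal. The function name `mv_utility` is hypothetical; the inputs are the ones from the example above.

```python
def mv_utility(exp_ret, sd, risk_aversion):
    """Mean-variance utility U = E(r) - 0.5 * A * sigma^2."""
    return exp_ret - 0.5 * risk_aversion * sd ** 2

r_f = 0.03                 # riskless return from the example
e_rp, sd_p = 0.25, 0.75    # risky asset from the first gamble

for a in (0.5, 0.78, 1.0):
    u_risky = mv_utility(e_rp, sd_p, a)
    u_safe = mv_utility(r_f, 0.0, a)
    print(f"A={a}: U(risky)={u_risky:.2f}, U(risk-free)={u_safe:.2f}")

# Indifference point: E(r_p) - 0.5 * A * sigma_p^2 = r_f
# => A = 2 * (E(r_p) - r_f) / sigma_p^2
a_breakeven = 2 * (e_rp - r_f) / sd_p ** 2
print(round(a_breakeven, 2))  # 0.78
```

Under these inputs, an investor with \(A\) below roughly 0.78 ranks the risky asset higher, and one with \(A\) above it ranks the riskless asset higher.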
Now what if you can invest in both assets, with different weights
on each?
What weight (\(w\)) should you allocate to the risky asset?
A real example
Consider a vaccine CEO
What would we expect to see in reality?
Expected returns from a blended portfolio
- Expected returns on a portfolio combining the risky asset \((p)\) and the risk-free asset are:
\[
\begin{aligned}
E(r_{\text{blended}}) & = E\big(wr_{p} + (1-w)r_{f}\big)\\
& = wE(r_{p}) + (1-w)E(r_{f})\\
& = r_{f} + wE(r_{p}-r_{f})
\end{aligned}
\]
- Note, expected returns on the complete portfolio are equal to two
parts:
- the risk free return (\(r_{f}\))
- the compensation for exposure to the risk in the risky asset (\(E(r_{p} - r_{f})\))
- Sometimes referred to as the “premium”
Return variance from a blended portfolio
- The variance of returns from the blended portfolio is:
\[
\begin{aligned}
\sigma^{2}(r_{\text{blended}}) &= \text{Var}\big(wr_{p} +
(1-w)r_{f}\big)\\
&= w^{2}\sigma^{2}(r_{p}) +
(1-w)^{2}\sigma^{2}(r_{f}) + 2w(1-w)\sigma(r_{p}, r_{f})\\
&= w^{2}\sigma^{2}(r_{p})\\
\sigma(r_{\text{blended}})&= w
\sigma(r_{p})
\end{aligned}
\]
- When we combine these two equations, we get the “capital
allocation line”
\[
E(r_{\text{blended}}) = r_{f} + \sigma(r_{\text{blended}})\frac{E(r_{p}
- r_{f})}{\sigma_{p}}
\]
Capital Allocation Line
\[
E(r_{\text{blended}}) = r_{f} + \sigma(r_{\text{blended}})\frac{E(r_{p}
- r_{f})}{\sigma_{p}} = 0.03 + \sigma(r_{\text{blended}})
\frac{0.22}{0.75}
\]
The capital allocation line shows all risk‐return combinations
available based on choice of w.
The slope of the capital allocation line (the Sharpe Ratio)
prices the risk-return tradeoff
In our example, the Sharpe Ratio is 0.29 (0.22/0.75)
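A small sketch can confirm that every blended portfolio sits on this line. The function name `blended` is hypothetical, and the standard deviation is written as \(|w|\sigma(r_{p})\) so the code also behaves sensibly if a negative weight is tried.

```python
r_f = 0.03
e_rp, sd_p = 0.25, 0.75

sharpe = (e_rp - r_f) / sd_p   # slope of the capital allocation line

def blended(w):
    """Expected return and standard deviation with weight w on the risky asset."""
    exp_ret = r_f + w * (e_rp - r_f)
    sd = abs(w) * sd_p
    return exp_ret, sd

for w in (0.0, 0.5, 1.0, 1.5):
    exp_ret, sd = blended(w)
    # For w >= 0, each (sd, exp_ret) pair lies on r_f + sharpe * sd.
    assert abs(exp_ret - (r_f + sharpe * sd)) < 1e-12
    print(w, round(exp_ret, 3), round(sd, 3))

print(round(sharpe, 2))  # 0.29
```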
Choosing w
- Which risk return combination do we want?
- Find the point where the indifference curve cannot move any further to
the northwest while still touching the CAL

Choosing w
- We can solve this problem more formally as
\[
\max_{w} U(r_{\text{blended}}) = r_{f} + w E(r_{p} - r_{f}) - \frac{1}{2} A w^{2}
\sigma^{2}(r_{p})
\]
- This is solved by finding the value of w that sets the derivative
equal to zero:
\[
w^{*} = \frac{E(r_{p} - r_{f})}{A\sigma^{2}(r_{p})}
\]
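For reference, the first-order condition behind this solution can be written out as follows (a worked step using only the objective above):
\[
\begin{aligned}
\frac{\partial U}{\partial w} &= E(r_{p} - r_{f}) - A w \sigma^{2}(r_{p}) = 0\\
\Rightarrow \; w^{*} &= \frac{E(r_{p} - r_{f})}{A\sigma^{2}(r_{p})}
\end{aligned}
\]
The second derivative is \(-A\sigma^{2}(r_{p})\), which is negative whenever \(A > 0\), so the stationary point is a maximum for a risk-averse investor. With the example's inputs and \(A = 0.25\), this gives \(w^{*} = 0.22 / (0.25 \times 0.5625) \approx 1.56\).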
Choosing w
| \(A\) | \(w^{*}\) | \(E(r)\) | \(\sigma(r)\) |
| --- | --- | --- | --- |
| 0.25 | 1.56 | 0.37 | 1.17 |
| 0.5 | 0.78 | 0.2 | 0.51 |
| 0.78 | 0.49 | 0.14 | 0.37 |
| 1 | 0.39 | 0.12 | 0.29 |
- What is the meaning of 1.56 in this table? Can you ever get a
negative w?
- How are the standard deviations and expected returns for these
optimal portfolios computed?
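One way to work through the last question is the sketch below, which applies the formulas from the preceding slides to the example's inputs. The function name `optimal_allocation` is hypothetical. Small differences from the tabulated values are possible, since the table's exact inputs and rounding are not shown.

```python
r_f = 0.03
e_rp, sd_p = 0.25, 0.75

def optimal_allocation(risk_aversion):
    """Return (w*, E(r), sigma) of the utility-maximizing blended portfolio."""
    w = (e_rp - r_f) / (risk_aversion * sd_p ** 2)   # w* = E(r_p - r_f) / (A sigma^2)
    exp_ret = r_f + w * (e_rp - r_f)                 # r_f + w * premium
    sd = w * sd_p                                    # sigma = w * sigma_p (w > 0 here)
    return w, exp_ret, sd

for a in (0.25, 0.5, 0.78, 1.0):
    w, exp_ret, sd = optimal_allocation(a)
    print(f"A={a}: w*={w:.2f}, E(r)={exp_ret:.2f}, sigma={sd:.2f}")
```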
Taking on more risk
Why would someone want leverage?
Two Risky Assets
Now suppose instead of choosing a mix between a risky and a
risk‐free asset, we have two risky assets (but no risk free
asset).
The expected return for the portfolio of risky assets A and B is
\[
E(r_{p}) = w_{a}E(r_{a}) + (1-w_{a})E(r_{b})
\]
where \(w_{a}\) is the weight on stock A.
The variance and standard deviation of the portfolio
are:
\[
\begin{aligned}
\sigma^{2}(r_{\text{p}}) &= \text{Var}\big(w_{a}r_{a} +
(1-w_{a})r_{b}\big)\\
&= w_{a}^{2}\sigma^{2}(r_{a}) +
(1-w_{a})^{2}\sigma^{2}(r_{b}) + 2w_{a}(1-w_{a})\sigma(r_{a}, r_{b})\\
&= w_{a}^{2}\sigma^{2}(r_{a}) +
(1-w_{a})^{2}\sigma^{2}(r_{b}) +
2w_{a}(1-w_{a})\rho_{a,b}\sigma(r_{a})\sigma(r_{b})
\end{aligned}
\]
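The two formulas above translate directly into code. In this sketch the function name `two_asset_portfolio` and all asset inputs are hypothetical, chosen only to show how the same 50/50 mix becomes less risky as the correlation falls.

```python
from math import sqrt

def two_asset_portfolio(w_a, e_a, e_b, sd_a, sd_b, rho):
    """Expected return and standard deviation with weight w_a on A and 1 - w_a on B."""
    w_b = 1.0 - w_a
    exp_ret = w_a * e_a + w_b * e_b
    var = (w_a ** 2 * sd_a ** 2 + w_b ** 2 * sd_b ** 2
           + 2 * w_a * w_b * rho * sd_a * sd_b)
    return exp_ret, sqrt(max(var, 0.0))  # guard against tiny negative rounding

# Hypothetical inputs, for illustration only.
e_a, e_b, sd_a, sd_b = 0.10, 0.05, 0.20, 0.10

for rho in (1.0, 0.0, -1.0):
    exp_ret, sd = two_asset_portfolio(0.5, e_a, e_b, sd_a, sd_b, rho)
    print(f"rho={rho:+.0f}: E(r)={exp_ret:.3f}, sigma={sd:.3f}")
```

The expected return is the same in all three cases; only the standard deviation responds to the correlation.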
Two Risky Assets
- Now suppose assets A and B are characterized as follows:
- Consider the frontier of returns/standard deviations available by
allocating different amounts to the two assets (i.e. varying \(w_{a}\))
- Notice the variance calculation now includes the correlation between
A and B
- Sheet MVfrontier.xls plots the frontier for different
correlations
2 Risky Assets (correlation \(\rho\) =1)
\[\rho_{a,b} = 1\]

2 Risky Assets (correlation \(\rho\) =-1)
\[\rho_{a,b} = -1\]

2 Risky Assets (correlation \(\rho\) =0)
\[\rho_{a,b} = 0\]
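For reference, the portfolio variance formula simplifies in each of these three special cases (a worked step from the two-asset variance above, writing \(\sigma_{a}\) and \(\sigma_{b}\) for the two standard deviations):
\[
\begin{aligned}
\rho_{a,b} = 1: \quad \sigma_{p} &= \big|w_{a}\sigma_{a} + (1-w_{a})\sigma_{b}\big|\\
\rho_{a,b} = -1: \quad \sigma_{p} &= \big|w_{a}\sigma_{a} - (1-w_{a})\sigma_{b}\big|\\
\rho_{a,b} = 0: \quad \sigma_{p}^{2} &= w_{a}^{2}\sigma_{a}^{2} + (1-w_{a})^{2}\sigma_{b}^{2}
\end{aligned}
\]
With perfect positive correlation and weights between 0 and 1, risk is just the weighted average of the two standard deviations, so there is no diversification benefit. With perfect negative correlation, risk can be eliminated entirely by choosing \(w_{a} = \sigma_{b} / (\sigma_{a} + \sigma_{b})\).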

Two risky assets
- Given the available risk return trade‐offs, which portfolio provides
the highest utility level for a given investor?
- Depends on their risk aversion (A)!
- More risk averse investors will choose a relatively safer portfolio
- This changes once you have a riskless asset to invest in
The Optimal Risky Portfolio without riskless asset
Formally, we can derive it using calculus, as
\[
\begin{aligned}
\max_{w_{a}} U(r_{p}) & = E(r_{p}) - \frac{1}{2} A
\sigma^{2}(r_{p})\\
\text{s.t.} \; E(r_{p}) &= w_{a}E(r_{a}) +
(1-w_{a})E(r_{b})\\
\text{and} \; \sigma^{2}(r_{p}) &= w_{a}^{2}\sigma^{2}(r_{a})
+ (1-w_{a})^{2}\sigma^{2}(r_{b}) +
2w_{a}(1-w_{a})\rho_{a,b}\sigma(r_{a})\sigma(r_{b})
\end{aligned}
\]
Solving directly (by plugging in and taking derivative w.r.t.
\(w_{a}\)):
\[
\begin{aligned}
w_{a}^{*} &= \frac{E(r_{a}) - E(r_{b})}{A(\sigma_{a}^{2} +
\sigma_{b}^{2} - 2\rho_{a,b}\sigma_{a}\sigma_{b})} +
\frac{\sigma_{b}^{2} - \rho_{a,b}\sigma_{a}\sigma_{b}}{\sigma_{a}^{2} +
\sigma_{b}^{2} - 2\rho_{a,b}\sigma_{a}\sigma_{b}}\\
&=\frac{E(r_{a}) - E(r_{b}) + A \big(\sigma_{b}^{2} -
\rho_{a,b}\sigma_{a}\sigma_{b}\big)}{A(\sigma_{a}^{2} + \sigma_{b}^{2} -
2\rho_{a,b}\sigma_{a}\sigma_{b})}
\end{aligned}
\]
- We can do better – just need a riskless asset
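The closed-form weight can be sanity-checked against a brute-force search over the utility function. This is only a sketch: the function names and all asset inputs are hypothetical.

```python
def utility(w_a, e_a, e_b, sd_a, sd_b, rho, risk_aversion):
    """Mean-variance utility of a two-asset portfolio with weight w_a on asset A."""
    w_b = 1.0 - w_a
    exp_ret = w_a * e_a + w_b * e_b
    var = (w_a ** 2 * sd_a ** 2 + w_b ** 2 * sd_b ** 2
           + 2 * w_a * w_b * rho * sd_a * sd_b)
    return exp_ret - 0.5 * risk_aversion * var

def optimal_weight(e_a, e_b, sd_a, sd_b, rho, risk_aversion):
    """Closed-form utility-maximizing weight on asset A (no riskless asset)."""
    cov = rho * sd_a * sd_b
    denom = sd_a ** 2 + sd_b ** 2 - 2 * cov
    return (e_a - e_b + risk_aversion * (sd_b ** 2 - cov)) / (risk_aversion * denom)

# Hypothetical inputs, for illustration only.
params = dict(e_a=0.10, e_b=0.05, sd_a=0.20, sd_b=0.10, rho=0.2)
grid = [i / 1000 for i in range(-1000, 2001)]   # candidate weights from -1 to 2

results = []
for a in (2.0, 4.0, 8.0):
    w_star = optimal_weight(risk_aversion=a, **params)
    w_grid = max(grid, key=lambda w: utility(w, risk_aversion=a, **params))
    results.append((a, w_star, w_grid))
    print(f"A={a}: closed form {w_star:.3f}, grid search {w_grid:.3f}")
```

Note that the weight on the riskier asset shrinks as \(A\) rises, so without a riskless asset different investors hold different risky portfolios.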
Two risky and one riskless asset
Add back our riskless asset
- \(r_{f} = 0.03\)
- \(\sigma(r_{f}) = 0\)
What is our Capital Allocation Line when we combine either A or B
with our riskless asset?
To illustrate the point more clearly, let \(\sigma(r_{a})=0.5\):
Two risky and one riskless asset

- The capital allocation line generated by asset A and the risk free
asset dominates that of asset B!
Finding the Efficient Risky Portfolio
- Note that the tangency portfolio can be defined as the portfolio on
the frontier that has the highest Sharpe Ratio (reward to risk
ratio)
\[
SR = \frac{E(r_{p} - r_{f})}{\sigma_{p}}
\]
- Subject to being on the portfolio frontier!
\[
\begin{aligned}
\max_{w_{a}} & \frac{E(r_{p} - r_{f})}{\sigma_{p}}\\
\text{s.t.} \; & E(r_{p}) = w_{a}E(r_{a}) + (1-w_{a})E(r_{b})\\
\text{and} \; & \sigma_{p}^{2} = w_{a}^{2}\sigma^{2}(r_{a}) +
(1-w_{a})^{2}\sigma^{2}(r_{b}) +
2w_{a}(1-w_{a})\rho_{a,b}\sigma(r_{a})\sigma(r_{b})
\end{aligned}
\]
Finding the MVE Portfolio
- The solution for \(w_{a,MVE}\) is:
\[
w_{a,MVE}^{*} = \frac{E(r_{a} - r_{f})\sigma^{2}_{b}- E(r_{b} -
r_{f})\sigma_{a}\sigma_{b}\rho_{a,b}}{E(r_{a}-r_{f})\sigma_{b}^{2} +
E(r_{b}-r_{f})\sigma_{a}^{2} - \big[E(r_{a}-r_{f}) +
E(r_{b}-r_{f})\big]\sigma_{a}\sigma_{b}\rho_{a,b}}
\]
Key thing to notice – this optimal portfolio does not include
anything related to investor risk aversion (\(A\))
Takeaway: All investors do best by choosing the
same risky portfolio and then deciding how much to allocate to
the riskless asset based on individual preferences
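As a check on the MVE formula, the sketch below compares it with a grid search for the weight that maximizes the Sharpe Ratio. The function names and asset inputs are hypothetical; note that \(A\) never appears.

```python
from math import sqrt

def sharpe_ratio(w_a, ex_a, ex_b, sd_a, sd_b, rho):
    """Sharpe Ratio of a two-asset portfolio; ex_a, ex_b are excess returns E(r - r_f)."""
    w_b = 1.0 - w_a
    var = (w_a ** 2 * sd_a ** 2 + w_b ** 2 * sd_b ** 2
           + 2 * w_a * w_b * rho * sd_a * sd_b)
    return (w_a * ex_a + w_b * ex_b) / sqrt(var)

def mve_weight(ex_a, ex_b, sd_a, sd_b, rho):
    """Closed-form weight on asset A in the tangency (MVE) portfolio."""
    cov = rho * sd_a * sd_b
    num = ex_a * sd_b ** 2 - ex_b * cov
    den = ex_a * sd_b ** 2 + ex_b * sd_a ** 2 - (ex_a + ex_b) * cov
    return num / den

# Hypothetical inputs, for illustration only.
r_f = 0.03
params = dict(ex_a=0.10 - r_f, ex_b=0.05 - r_f, sd_a=0.20, sd_b=0.10, rho=0.2)

w_mve = mve_weight(**params)
grid = [i / 1000 for i in range(0, 1001)]       # candidate weights from 0 to 1
w_grid = max(grid, key=lambda w: sharpe_ratio(w, **params))
print(f"closed form {w_mve:.3f}, grid search {w_grid:.3f}")
```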
Mean-Variance Cookbook
Now we have a simple three-step recipe for an optimal portfolio, based
on our taste for risk!
- Specify the expected returns, standard deviations and covariances
of our risky assets. Also define the riskless rate of return.
- Using these parameters, solve for the unique MVE portfolio. This is
the tangency portfolio! Everyone wants this risky portfolio, regardless
of taste for risk.
- Choose weights between the MVE portfolio and the riskless asset
based on your taste for risk \(A\).
- Note: this recipe holds for as many risky assets as we would
like!
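The recipe above can be sketched end to end for the two-asset case. Everything here is illustrative: the function names and the asset inputs are hypothetical, and the final allocation reuses the earlier single-risky-asset rule \(w^{*} = E(r_{p} - r_{f}) / (A\sigma^{2}(r_{p}))\) with the MVE portfolio playing the role of the risky asset.

```python
def mve_weight(ex_a, ex_b, sd_a, sd_b, rho):
    """Weight on asset A in the tangency (MVE) portfolio; ex_* are excess returns."""
    cov = rho * sd_a * sd_b
    num = ex_a * sd_b ** 2 - ex_b * cov
    den = ex_a * sd_b ** 2 + ex_b * sd_a ** 2 - (ex_a + ex_b) * cov
    return num / den

def complete_portfolio(e_a, e_b, sd_a, sd_b, rho, r_f, risk_aversion):
    """Final weights on A, B and the riskless asset for a given risk aversion."""
    # Solve for the MVE portfolio (does not depend on risk aversion).
    w_a = mve_weight(e_a - r_f, e_b - r_f, sd_a, sd_b, rho)
    w_b = 1.0 - w_a
    e_mve = w_a * e_a + w_b * e_b
    var_mve = (w_a ** 2 * sd_a ** 2 + w_b ** 2 * sd_b ** 2
               + 2 * w_a * w_b * rho * sd_a * sd_b)
    # Split wealth between the MVE portfolio and the riskless asset.
    y = (e_mve - r_f) / (risk_aversion * var_mve)
    return {"A": y * w_a, "B": y * w_b, "risk_free": 1.0 - y}

# Hypothetical inputs, for illustration only.
inputs = dict(e_a=0.10, e_b=0.05, sd_a=0.20, sd_b=0.10, rho=0.2, r_f=0.03)
cautious = complete_portfolio(risk_aversion=8.0, **inputs)
bold = complete_portfolio(risk_aversion=2.0, **inputs)
print(cautious)
print(bold)
```

Both investors hold A and B in the same proportion; they differ only in how much they put in the riskless asset (a negative `risk_free` weight means borrowing).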
Many (\(N\)) risky assets
Let’s do an example with the following stocks: Ford, IBM,
Microsoft, Netflix and Walmart.
Take their monthly returns from 2010-2018. What’s their average
monthly return and standard deviation? What about the Sharpe Ratio?
Historical Returns for 4 stocks (2010-2018)
| Ticker | Monthly Return | Monthly SD | Sharpe Ratio |
| --- | --- | --- | --- |
| F | 0.006 | 0.075 | 0.086 |
| IBM | 0.004 | 0.047 | 0.083 |
| MSFT | 0.016 | 0.063 | 0.250 |
| NFLX | 0.054 | 0.180 | 0.300 |
Many Risky Assets (adding assets helps diversify)

Diversification benefits
So why does diversification shift the frontier?
- Adding assets can never make us worse off (if they did, we would put
zero weight in them)
- Can only help reduce risk via imperfect or negative correlation
- In spite of being risk averse, we’d like to get exposure to as many
different types of risks as possible
Correlation of Returns for 5 stocks (2010-2018)
| | F | IBM | MSFT | NFLX | WMT |
| --- | --- | --- | --- | --- | --- |
| F | 1.000 | 0.279 | 0.357 | 0.210 | 0.159 |
| IBM | 0.279 | 1.000 | 0.339 | 0.107 | 0.221 |
| MSFT | 0.357 | 0.339 | 1.000 | 0.216 | 0.135 |
| NFLX | 0.210 | 0.107 | 0.216 | 1.000 | -0.066 |
| WMT | 0.159 | 0.221 | 0.135 | -0.066 | 1.000 |
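To see the diversification benefit in these numbers, the sketch below builds a covariance matrix from the correlations and the monthly standard deviations in the historical returns table, then compares an equal-weighted portfolio's risk with the average risk of its components. WMT is left out because its standard deviation does not appear in that table, and the equal weighting is an assumption chosen purely for illustration.

```python
from math import sqrt

tickers = ["F", "IBM", "MSFT", "NFLX"]
sd = [0.075, 0.047, 0.063, 0.180]        # monthly SDs from the returns table
corr = [                                 # matching block of the correlation table
    [1.000, 0.279, 0.357, 0.210],
    [0.279, 1.000, 0.339, 0.107],
    [0.357, 0.339, 1.000, 0.216],
    [0.210, 0.107, 0.216, 1.000],
]

n = len(tickers)
# Covariance between i and j is rho_ij * sigma_i * sigma_j.
cov = [[corr[i][j] * sd[i] * sd[j] for j in range(n)] for i in range(n)]

w = [1.0 / n] * n                        # equal weights (an assumption)
port_var = sum(w[i] * w[j] * cov[i][j] for i in range(n) for j in range(n))
port_sd = sqrt(port_var)
avg_sd = sum(wi * si for wi, si in zip(w, sd))

print(f"portfolio SD {port_sd:.3f} vs weighted-average SD {avg_sd:.3f}")
```

Because every pairwise correlation is well below 1, the portfolio's standard deviation comes out below the weighted average of the individual standard deviations.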
In practice, how do we construct a frontier?
- The mutual fund separation theorem says the entire “bullet” can be
constructed from any two efficient portfolios on the bullet
- Tangency or MVE portfolio
- Found by maximizing the Sharpe Ratio
- Minimum Variance Portfolio (MVP)
- Found by minimizing the standard deviation
- With no constraints, can solve exactly with calculus (or a solver)
- With constraints, need to use a numerical solver
- Then, with these two portfolios, just vary the weights on each (as
in our two risky asset setting)
Finding the Minimum Variance Portfolio
- Instead of maximizing the Sharpe ratio, what if we want to minimize
risk ignoring returns?
- Obviously, if we have the riskless asset, we invest exclusively in
that.
- What about just in the risky assets?
- In the two-asset case, the problem is defined as:
\[
\begin{aligned}
\min_{w_{a}} \sigma^{2}_{p} &= w_{a}^{2}\sigma^{2}(r_{a}) +
(1-w_{a})^{2}\sigma^{2}(r_{b}) +
2w_{a}(1-w_{a})\rho_{a,b}\sigma(r_{a})\sigma(r_{b})\\
w_{a, MVP}^{*} &= \frac{\sigma_{b}^{2} -
\sigma_{a}\sigma_{b}\rho_{a,b}}{\sigma_{a}^{2} + \sigma_{b}^{2} -
2\sigma_{a}\sigma_{b}\rho_{a,b}}
\end{aligned}
\]
- For N assets, the problem is defined using vectors (optional):
\[
\begin{aligned}
\min_{\mathbf{w}} \sigma^{2}_{p} &= \mathbf{w}' \Sigma
\mathbf{w}\\
\text{s.t.} \; \mathbf{w}'\mathbf{1} &= 1\\
\end{aligned}
\]
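For reference, the N-asset problem has a standard closed-form solution when \(\Sigma\) is invertible and there are no other constraints. Writing the Lagrangian with multiplier \(\lambda\) (a symbol introduced here only for this derivation):
\[
\begin{aligned}
\mathcal{L} &= \mathbf{w}'\Sigma\mathbf{w} - \lambda(\mathbf{w}'\mathbf{1} - 1)\\
\frac{\partial \mathcal{L}}{\partial \mathbf{w}} &= 2\Sigma\mathbf{w} - \lambda\mathbf{1} = 0
\;\Rightarrow\; \mathbf{w} = \frac{\lambda}{2}\Sigma^{-1}\mathbf{1}\\
\mathbf{w}'\mathbf{1} = 1 &\;\Rightarrow\; \mathbf{w}_{MVP} =
\frac{\Sigma^{-1}\mathbf{1}}{\mathbf{1}'\Sigma^{-1}\mathbf{1}}
\end{aligned}
\]
With two assets, inverting the \(2 \times 2\) covariance matrix reproduces the \(w_{a,MVP}^{*}\) formula above.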
Parameter Uncertainty
- So far, portfolio optimization looks easy
- But, we have assumed we know the right inputs to plug into our
model
- In practice what are the expected returns?
- Similar, but less severe issues arise with covariances
- See Garlappi, Demiguel, Uppal 2007 RFS (on course website)
Parameter uncertainty – an extreme example
- Suppose two assets have expected returns of 8%, standard
deviations of 20%, and a correlation of 0.99.
- What is the optimal portfolio?
- 50/50 by symmetry
- Now suppose we get more data, and realize that one asset has an
expected return of 9%
- Our new weights would be 635% and -535%
- Small errors can have extreme effects!
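The extreme weights in this example can be reproduced with the two-asset MVE formula from earlier in the lecture. The sketch assumes a riskless rate of zero, which the slide does not state; that assumption is what recovers the quoted weights of roughly 635% and -535%. The function name `mve_weight` is hypothetical.

```python
def mve_weight(ex_a, ex_b, sd_a, sd_b, rho):
    """Weight on asset A in the tangency (MVE) portfolio; ex_* are excess returns."""
    cov = rho * sd_a * sd_b
    num = ex_a * sd_b ** 2 - ex_b * cov
    den = ex_a * sd_b ** 2 + ex_b * sd_a ** 2 - (ex_a + ex_b) * cov
    return num / den

sd, rho = 0.20, 0.99
r_f = 0.0   # assumption: not given on the slide

w_equal = mve_weight(0.08 - r_f, 0.08 - r_f, sd, sd, rho)     # both assets at 8%
w_revised = mve_weight(0.09 - r_f, 0.08 - r_f, sd, sd, rho)   # one asset revised to 9%

print(round(w_equal, 2), round(w_revised, 2))  # 0.5 6.35
```

A one percentage point change in a single expected return moves the weight on that asset from 50% to about 635%, with the other asset shorted to fund it.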
How do we deal with this?
- Models
- CAPM/APT serve as models of expected returns and give us theoretical
guidance on parameter inputs
- Constraints
- Extreme mistakes in parameters will generate extreme portfolios
- Constraining weights can minimize this risk
- Bayesian analysis
- Supplement data with prior beliefs about the model inputs
- Model-based, or just naive
- Machine-learning
- Improve our predictions
- Minimize the mean-squared error of our parameter estimates
Bayesian Methods
- What are Bayesian methods?
- Suppose I have two signals (\(s_{1}\) and \(s_{2}\)) about a value, \(\mu\)
- Each signal is Normally distributed, with a mean of \(\mu\) and a variance of \(\sigma_{i}^{2}\) (\(i = 1, 2\),
respectively)
- The “best” forecast of the parameter \(\mu\) is a precision-weighted average of
\(s_{1}\) and \(s_{2}\):
\[
\begin{aligned}
\hat{\mu} &= \frac{s_{1}\sigma_{2}^{2} +
s_{2}\sigma_{1}^{2}}{\sigma_{2}^{2} + \sigma_{1}^{2}}\\
&= \frac{\frac{1}{\sigma_{1}^{2}} s_{1} +
\frac{1}{\sigma_{2}^{2}}s_{2}}{\frac{1}{\sigma_{1}^{2}} +
\frac{1}{\sigma_{2}^{2}}}\\
\end{aligned}
\]
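A minimal sketch of the precision-weighted average follows. The function name `precision_weighted` and the signal values are hypothetical.

```python
def precision_weighted(s1, var1, s2, var2):
    """Combine two noisy signals about the same value, weighting by precision (1 / variance)."""
    p1, p2 = 1.0 / var1, 1.0 / var2
    return (p1 * s1 + p2 * s2) / (p1 + p2)

# Hypothetical signals: a noisy one at 10% and a more precise one at 6%.
mu_hat = precision_weighted(s1=0.10, var1=0.04, s2=0.06, var2=0.01)
print(round(mu_hat, 3))  # 0.068
```

The forecast lands much closer to the second signal because its variance is a quarter of the first signal's, so it receives four times the weight.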
Bayesian Methods
- Suppose we take historical returns as informative about the future
- but we don’t want to overweight them because they’re imprecise
- Use your prior beliefs (either your own personal feelings, or from
other sources) as your second signal, and weight based on precision
- For example, expected returns estimates might be a blended average
of historical averages and constant returns
- Historical record weighted based on precision
- Often called parameter “shrinkage”
Bayesian Methods
- Optimal Bayesian weighting can be shown to shrink historical average
returns towards the expected return of the minimum-variance portfolio
(the tip of the bullet)
\[
\begin{aligned}
\hat{\mu}_{JS} &= (1- \hat{w})\bar{X} + \hat{w} \mu \mathbf{1}\\
\mu &=
\frac{\mathbf{1}'\Sigma^{-1}\bar{X}}{\mathbf{1}'\Sigma^{-1}\mathbf{1}}\\
\hat{w} &= \frac{N+2}{(N+2) + (\bar{X} - \mu\mathbf{1})'T\Sigma^{-1}(\bar{X}
- \mu\mathbf{1})}
\end{aligned}
\]
- where \(\bar{X}\) is the vector of our empirically
observed average returns and \(\mu\) is
the expected return of the MVP
- \(\mu\) is not explicitly designed
to be the MVP return, but it happens to equal the expected return of the
portfolio that minimizes variance
- \(N\) is the number of stocks, \(T\) is the number of time periods and \(\Sigma\) is the covariance matrix
(also estimated)
- What’s the intuition?
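To build some intuition, the sketch below applies the shrinkage formulas to three hypothetical assets. To stay free of matrix libraries it assumes the assets are uncorrelated, so that \(\Sigma\) is diagonal and its inverse is just the reciprocal of each variance; that simplification, the function name `shrink_means` and all inputs are assumptions made only for illustration.

```python
def shrink_means(x_bar, variances, t_periods):
    """Shrink sample means toward the MVP return, assuming a diagonal covariance matrix."""
    n = len(x_bar)
    inv = [1.0 / v for v in variances]            # diagonal of Sigma^{-1}
    # mu = 1' Sigma^{-1} x_bar / 1' Sigma^{-1} 1  (expected return of the MVP)
    mu = sum(i * x for i, x in zip(inv, x_bar)) / sum(inv)
    # Distance term: T * (x_bar - mu 1)' Sigma^{-1} (x_bar - mu 1)
    dist = t_periods * sum(i * (x - mu) ** 2 for i, x in zip(inv, x_bar))
    w_hat = (n + 2) / ((n + 2) + dist)
    shrunk = [(1.0 - w_hat) * x + w_hat * mu for x in x_bar]
    return shrunk, mu, w_hat

# Hypothetical monthly inputs, for illustration only.
x_bar = [0.005, 0.010, 0.020]
variances = [0.0025, 0.0036, 0.0100]
shrunk, mu, w_hat = shrink_means(x_bar, variances, t_periods=108)

print(f"mu={mu:.4f}, w_hat={w_hat:.3f}")
print([round(s, 4) for s in shrunk])
```

Each shrunk estimate sits between the sample average and the common target \(\mu\). A smaller \(T\) or a tighter cluster of sample means gives a smaller distance term and hence a larger \(\hat{w}\), pulling the estimates further toward \(\mu\).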
Bayesian Methods
Taken to an extreme, this gives us some naive portfolios
For example, what happens if I put all my weight on the MVP prior
(e.g. returns are flat across all stocks)
- What portfolio do you think I’ll get?
- Why would I put all my weight on the simple prior?
Note the MVP (a “naive” portfolio) can be motivated by Bayesian
methods
Bayesian Methods
- Of course, just like means, the covariance matrix can be shrunk as well
- What portfolio do you get from extreme shrinkage of expected returns
and covariances?
- Meanwhile, can show that weight constraints are equivalent to
shrinking the covariance matrix for the assets affected by constraints
- Jagannathan and Ma, “Risk Reduction in Large Portfolios”, Journal of
Finance
Constraints
- When considering constraints, think about what underlying
fundamentals are implied by the constraints
- i.e. “what expected returns would I have to assume to make a
constrained portfolio also be the optimal unconstrained portfolio?”
- Next, we learn how to invert the portfolio choice process, turning
chosen weights into implied expected returns